Greg Detre
Monday, April 28, 2003
"It may well be that the way to build an intelligence is just to get your hands on dirty engineering problems. We don't have a theory of automobiles. We have good cars, but there are no fundamental equations of automotive science." - Hans Moravec[1]
In short, I find Minsky's descriptions of the problems and possible solutions to AI common sense highly congenial. Describing common sense knowledge as "all the things that don't need to be stated" neatly sums up our intuitions. As Lenat[2] remarked, "within a few months we realised that what [encyclopaedias] contain is almost the complement of common sense". When thinking about common sense, Minsky emphasises the following ideas:
- a large and diverse compendium of methodologies, facts and representations;
- knowing "a little about a lot";
- the consequent need to rely on ambiguity, analogy, metaphor, association and other strategies;
- lots of exceptions and inconsistencies, making it inherently buggy, and yet very robust.
I don't disagree with any of these points. Instead, I'm going to discuss a set of related issues that are problematic for any common sense or large-scale intelligent system, and which I believe might bear further analysis. One way into these issues is to look at the problem of common sense reasoning as a problem of search. Let us assume, for the moment, that the knowledge we need to answer most common sense questions about a simple story has been encoded in some declarative form, e.g. as a series of natural language facts or predicate calculus formulae. A real problem of combinatorial explosion arises: unlike a search through chess moves, for example, the kind of search that common sense reasoning requires tends to be shallow but potentially much, much wider. As Henry Lieberman put it, the size of human common sense knowledge is around the "small end of infinity" (estimates range from 20 million to several billion assertions).
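To put rough numbers on this, here is a toy sketch of how a shallow-but-wide search can dwarf a narrow-but-deep one. The branching factors and depths are illustrative assumptions, not measurements; the chess figures are conventional approximations.

```python
# A toy comparison of search-space sizes for deep-narrow vs shallow-wide
# search trees. All numbers below are illustrative assumptions.
def search_space(branching_factor: int, depth: int) -> int:
    """Number of leaf nodes in a uniform search tree."""
    return branching_factor ** depth

# Chess-like search: narrow but deep (~35 legal moves per position, 8 plies).
chess = search_space(35, 8)

# Common-sense search: shallow but very wide. Lieberman's "small end of
# infinity" estimate starts at around 20 million assertions; take that as
# the branching factor, with a search only 2 levels deep.
common_sense = search_space(20_000_000, 2)

print(f"chess (35^8):         {chess:.3e}")
print(f"common sense (20m^2): {common_sense:.3e}")
print(f"ratio: {common_sense / chess:.1f}x")
```

Even at a depth of only two, the sheer width of the fan-out makes naive exhaustive search hopeless, which is the point of the "small end of infinity" remark.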
Yet even brief introspection shows that however we manage common sense reasoning, we seem to do it with great ease and rapidity. I'm going to argue that a greater emphasis on simulation and models may help with some aspects of the problems of relevance, context and combinatorial explosion.
Perhaps we shouldn't visualise common sense knowledge as a warehouse of facts, but as a set of integrated models of the world. This cuts across any facts vs methods distinction, since facts can be generated by instantiating a model with parameters, and methods can be devised or tested within a model. Perhaps common sense cannot be captured by any realistic number of assertions (and the brain doesn't do it this way): those assertions are the symptoms rather than the cause of common sense. We generate those assertions, but we cannot generate more assertions from them without being able to plug parameters into our model of the world. And because we don't want to have to fully instantiate a complete model of the world every time we consider how to deal with a particular closed situation, we divide our world model into many sub-models of different domains and at different levels. The same particular situation may be multiply represented (if you feed the right sets of parameters into each model-representation), which gives us robustness and flexibility, and allows us to deal with uncertainty and incomplete information.
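As a minimal sketch of this "assertions as symptoms of a model" idea, consider the following toy model. The class, its single parameter and its threshold are all hypothetical illustrations, not a proposal for real naive physics; the point is only that one small parameterised model can generate an open-ended family of "facts" on demand.

```python
# Instead of storing facts like "a glass near the table edge may fall",
# keep a tiny parameterised model and generate such assertions on demand.
# The model and its threshold are invented for illustration.
from dataclasses import dataclass

@dataclass
class SupportModel:
    """Toy physics: an object falls if too little of it rests on a surface."""
    min_support_fraction: float = 0.5  # assumed threshold, not real physics

    def will_fall(self, support_fraction: float) -> bool:
        return support_fraction < self.min_support_fraction

    def assertion(self, obj: str, surface: str, support_fraction: float) -> str:
        verdict = ("will fall off" if self.will_fall(support_fraction)
                   else "rests safely on")
        return f"the {obj} {verdict} the {surface}"

model = SupportModel()
# The same model yields many different "facts" from different parameters:
print(model.assertion("glass of water", "table", 0.2))
print(model.assertion("book", "shelf", 0.9))
</antml>```

Swapping in different parameters (or a different sub-model for a different domain) regenerates the assertions, rather than requiring each one to have been stored in advance.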
This sounds very much like a standard description of a frame-array with common terminals, and frames and scripts are certainly vital representations in any common sense system. But frames alone can't do everything, and I don't think Minsky intends that they should. To take a domain close to my heart: playing a soccer game well requires me to know a great deal about the way my body works, how the ball flies, and how people move. The niceties of soccer strategy might not seem to come under the description of common sense (at least in America) until we consider that, if we know enough about the game, we can generate all sorts of useful common sense assertions: it involves somewhere between two and approximately twenty people, it takes place in a confined space of a given size, it can be used as a basis for understanding lots of other sports and competitions, and so on. We could program in all these facts, but people gain a much richer understanding of them just from having participated once themselves.
I am concerned, though, that frames are overly linguistic/symbolic, and cannot deal well with noisy or incomplete data in high-dimensional or continuous procedures or situations. The most important such areas that I can think of are motor/body knowledge, local geography, naïve physics, and emotional and social interactions. These comprise a very large proportion of common sense knowledge. In all such cases, I believe we can store such knowledge much more efficiently in some procedural form, and generate the relevant common sense assertions by simulation. Indeed, simulation theory is one of the leading theories of "theory of mind", i.e. our ability to understand and predict others' behaviour by simulating how we would feel in their situation if we had their beliefs and desires. I am not sure how literally I am arguing that we imagine what we would do if we were a glass of water teetering on the edge of a table, but we definitely have the procedural/modelling ability to do this, and it's difficult to see how a declarative knowledge base could ever be rich or exhaustive enough to demonstrate the robustness that our common sense knowledge has.
Knowing which model to use at a given time, and which default parameters, is the problem of context. In a sense, the problem of context is the problem of common sense. Common sense on this view is the business of knowing when to apply which rules and exceptions, when to override default parameters, and which level of granularity is sufficient. As Minsky puts it:
"'Lift', for example, has different implications when an object weighs one gram, or a thousand, and really to understand lifting requires a network for making appropriate shifts among such different micro-senses".
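Minsky's point can be sketched as a toy dispatcher: which micro-sense of "lift" applies depends on a context parameter, here just the object's mass. The particular senses and thresholds below are invented for illustration.

```python
# A toy "network" for shifting among micro-senses of "lift" by context.
# The senses and mass thresholds are illustrative inventions.
def lift_sense(mass_grams: float) -> str:
    if mass_grams < 10:
        return "pinch between two fingers"
    elif mass_grams < 5_000:
        return "grasp and raise with one hand"
    elif mass_grams < 50_000:
        return "brace and lift with both arms"
    else:
        return "use a lever, winch or helper"

print(lift_sense(1))        # a feather-weight object
print(lift_sense(1_000))    # a one-kilogram object
print(lift_sense(100_000))  # far beyond one person's unaided strength
```

A real system would of course condition on far more than mass (shape, fragility, grip, social setting), but even this caricature shows how one word can index many distinct procedures.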
Of course, I have put the case for procedural over declarative common sense representations more strongly than it should be. Multiple representations are clearly the way forward, as long as we have a means of moving between them.
The second major point I want to make relates to the way that we think of "common sense knowledge", in terms of how we define it and its psychological qualities.
It feels salient somehow that there is a common, well-known word/phrase in English for "common sense", whose meaning when applied to different situations is itself common sensical, and which feels as though it's pointing to some neatly delineable set of representations or methods. In view of this, I hoped to be able to come up with a definition that even more neatly captured the distinction between common sense and intelligence.
Starting with Minsky's slightly tongue-in-cheek definition of intelligence as being what's exhibited "when you see someone do something you want to be able to do", I contrasted this with a preliminary redefinition of common sense as:

the baseline level of performance on tasks in an environment you're used to
This was intended to capture the fact that common sense tends to be used as a kind of binary measure of whether or not some ability or knowledge should be expected of every adult. I will return to this notion of common sense as a threshold measure of difficulty below.
By adding the caveat "in an environment you're used to", where environment is intended to encompass everything from jungles to classes, I was hoping to make room for the kind of relativisation of common sense by culture and specialisation/expertise that we clearly see. For instance, it's common sense to computer scientists that if I've only typed a couple of characters and yet my C compiler suddenly reports tens of errors, I should start by looking for a missed semi-colon or bracket; but that wouldn't be common sense to a novice programmer. In a similar fashion, the common sense that I have would be only barely applicable if I was wandering around in a space suit on the moon. Think of the things that my common sense would get wholly wrong: my estimations of the way my body moves and the physical limitations of what is possible, my naïve physics understanding of how flags move in the wind and how chemicals react, how to survive, social interactions with other astronauts. These sound like small things, but in a less extremely different environment, the usefulness of our common sense would be proportionately affected. Our environment is defined in terms of our embodiment and the way it limits and affects our interactions, which is why common sense is so human-centred. Common sense is domain-specific expertise where the domain is the everyday.
The hope was that I could contrast this with intelligence, which is a more "inventive" process for solving more novel or unexpected problems, perhaps in unfamiliar domains, perhaps in the abstract, for learning new rules and facts quickly and relating them together, often involving a deeper search and more difficult cognitive abilities like dynamic chunking and recursion.
Although this definition of common sense works reasonably well, I'm going to try to argue for a stronger statement, namely that common sense is:
a minimum set of optimised representations that allow us to be "intelligent" in a new knowledge-domain
It seems key to me that the best way to pin down common sense knowledge is as the things that are always left implicit and unstated. The assumption that Minsky makes is that this is because it's knowledge that everyone shares, and so it would be redundant to write it down every time. I briefly entertained the idea that common sense is knowledge that doesn't get written down because it can't easily be written down, or because our linguistic concepts bootstrap themselves out of our common sense knowledge. But of course, common sense can't be wholly sub- or non-linguistic, since we can put it into words for the most part if we try (witness Open Mind).
So instead, one possibility is that the threshold of common sense is the point at which words and concepts become rich enough internally, and strongly enough inter-connected, for us to be able to talk about them and use them with ease, familiarity and confidence. The suggestion is that at the core of any human knowledge-domain is a set of models, methodologies, scripts and facts which have been expanded, optimised and inter-referenced to allow rapid searching to only a couple of levels deep, with exceptions and inconsistencies already clearly marked and bounded. These common sense cores may use a different, more redundant representation than knowledge in domains in which our common sense doesn't apply. I'm imagining something analogous to the way in which important speed-intensive loops or inline functions are optimised by modern compilers.
This would fit with our vague intuition that common sense knowledge is somehow distinct from normal knowledge, and also from the kinds of abilities that we consider intelligent, and that this difference can be seen from the outside both as a measure of performance and in terms of the familiarity of the task. It would also explain why one can't be intelligent in a domain without first having some common sense notions of how to operate within it.
Strangely enough, my initial idea was to try and conflate common sense and intelligence, viewing them as the same processes being applied to either familiar or novel problems. I took the view that more or less all of the ideas about how we learn and ways to think were applicable to common sense knowledge. I also took the view that any sort of approach that focused on storing knowledge, rather than generating knowledge from simulations and models, would probably face an impossible combinatorial explosion, and that predicate calculus in particular was singularly ill-suited to the task of indexing the knowledge by relevance.
I then independently argued for a view of common sense knowledge as being a highly-optimised, expanded, dense core at the heart of various familiar domains, on which the processes we usually consider "intelligent" depend.
I now can't decide whether a preference for procedural, simulation-based representations is consistent with this latter view. It may be that while procedural representations are more memory-efficient, declarative knowledge (e.g. a vast storehouse of "facts") would be better suited to this optimisation of highly-relevant common sense knowledge.
---
On a side note, I wasn't sure whether the use of the term "intentionality" in section 6.4.1 was quite how it's usually used in philosophy. Intentionality in philosophy is summarised well by Daniel Dennett[3]:
"Some things are about other things: a belief can be about icebergs, but an iceberg is not about anything ... The term was coined by the Scholastics in the Middle Ages, and derives from the Latin verb intendo, meaning to point (at) or aim (at) or extend (toward). Phenomena with intentionality point outside themselves, in effect, to something else: whatever they are of or about."
It feels as though a more appropriate word for the seemingly wilful, purposive kind of intention (in the non-philosophical sense) that goals are described as having in section 6.4.1 would be "teleology". Unfortunately, here the alliterative allure of the "intensity of intentionality" may have proven a false friend :).